415 research outputs found

Effect of Tuned Parameters on a LSA MCQ Answering Model

Author: A. C. Graesser
Alain Lifchitz
C. H. Q. Ding
D. I. Martin
G. Denhière
G. Salton
Guy Denhière
J. Diaz
J. Quesada
M. Efron
M. F. Porter
M. W. Berry
S. Deerwester
S. T. Dumais
Sandra Jhean-Larose
W. Kintsch
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

This paper presents the current state of a work in progress whose objective is to better understand the effects of the factors that significantly influence the performance of Latent Semantic Analysis (LSA). A difficult task, which consists of answering (French) biology Multiple Choice Questions, is used to test the semantic properties of the truncated singular space and to study the relative influence of the main parameters. Dedicated software has been designed to fine-tune the LSA semantic space for the Multiple Choice Questions task. With optimal parameters, the performance of our simple model is, quite surprisingly, equal or superior to that of 7th and 8th grade students. This indicates that the semantic spaces were quite good despite their low dimensions and the small sizes of the training data sets. In addition, we present an original entropy global weighting of the answers' terms for each question of the Multiple Choice Questions, which was necessary to achieve the model's success. Comment: 9 pages

arXiv.org e-Print Archive

Towards an OpenSource Logger for the Analysis of RPA Projects

Author: A Jimenez-Ramirez
D Gusfield
HP Fung
JG Enríquez
L Reinkemeyer
L Willcocks
S Aguirre
S Dumais
S Singh
T Taulli
W Aalst
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Process automation typically begins with the observation of humans conducting the tasks that will eventually be automated. Similarly, successful RPA projects require a prior analysis of the undergoing processes which are being executed by humans. The process of collecting this type of information is known as user interface (UI) logging, since it records the interaction against a UI. The main RPA platforms (e.g., Blueprism and UIPath) incorporate functionalities that allow the recording of these UI interactions. However, the records that these platforms generate lack some functionalities that large-scale RPA projects require. Moreover, they are only understandable by the RPA platforms themselves. This paper presents an extensible and multi-platform OpenSource UI logger that generates UI logs in a standard format. This system collects information from all the computers it is running on and sends it to a central server for processing. Treatment of the collected information will allow the creation of an enriched UI log which can be used, among other purposes, for smart process analysis, machine learning training, the creation of RPA robots, or, more generally, for task mining. Ministerio de Economía y Competitividad TIN2016-76956-C3-2-R (POLOLAS); Junta de Andalucía CEI-12-TIC021; Centro para el Desarrollo Tecnológico Industrial (CDTI) P011-19/E0

Crossref

idUS. Depósito de Investigación Universidad de Sevilla

Privacy-Preserving Similarity-Based Text Retrieval

Author: Agrawal R.
Anderson R.
Bao F.
Bawa M.
Berchtold S.
Chor B.
Dingledine R.
Dumais S. T.
ElGamal T.
Husbands P.
Hweehwa Pang
Jialie Shen
Kim J.
Ramayya Krishnan
Song D. X.
Wang S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/02/2010

Article No.: 4

Crossref

Institutional Knowledge at Singapore Management University

Transcriptional landscape of the human and fly genomes: Nonlinear and multifunctional modular model of transcriptomes

Author: Bell I.
Cheng J.
Cheung E.
Dike S.
Drenkow J.
Dumais E.
Duttagupta R.
Ganesh M.
Ghosh S.
Gingeras T. R.
Helt G.
Kapranov P.
Manak J. R.
Nix D.
Piccolboni A.
Sementchenko V.
Tammana H.
Willingham A. T.
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/01/2006

Regions of the genome not coding for proteins or not involved in cis-acting regulatory activities are frequently viewed as lacking in functional value. However, a number of recent large-scale studies have revealed significant regulated transcription of unannotated portions of a variety of plant and animal genomes, allowing a new appreciation of the widespread transcription of large portions of the genome. High-resolution mapping of the sites of transcription of the human and fly genomes has provided an alternative picture of the extent and organization of transcription and has offered insights for biological functions of some of the newly identified unannotated transcripts. Considerable portions of the unannotated transcription observed are developmental or cell-type-specific parts of protein-coding transcripts, often serving as novel, alternative 5′ transcriptional start sites. These distal 5′ portions are often situated at significant distances from the annotated gene and alternatively join with or ignore portions of other intervening genes to comprise novel unannotated protein-coding transcripts. These data support an interlaced model of the genome in which many regions serve multifunctional purposes and are highly modular in their utilization. This model illustrates the underappreciated organizational complexity of the genome and one of the functional roles of transcription from unannotated portions of the genome. Copyright 2006, Cold Spring Harbor Laboratory Press

Cold Spring Harbor Laboratory Institutional Repository

Nonlinear transmission through a tapered fiber in rubidium vapor

Author: Abraham
Alexandrov
Balykin
Birks
Bloembergen
Dumais
Ghosh
Hatakeyama
J. D. Franson
Kien
Klempt
Leunga
Marinelli
Meucci
Nayak
S. M. Hendrickson
Siddons
Spillane
T. B. Pittman
Tong
Warken
Publication venue: 'The Optical Society'
Publication date: 01/12/2008

Sub-wavelength diameter tapered optical fibers surrounded by rubidium vapor can undergo a substantial decrease in transmission at high atomic densities due to the accumulation of rubidium atoms on the surface of the fiber. Here we demonstrate the ability to control these changes in transmission using light guided within the taper. We observe transmission through a tapered fiber that is a nonlinear function of the incident power. This effect can also allow a strong control beam to change the transmission of a weak probe beam. Comment: 10 pages, 4 figures

arXiv.org e-Print Archive

Crossref

Machine Learning in Automated Text Categorization

Author: ANDROUTSOPOULOS I.
ATTARDI G.
BAKER L.D.
BIEBRICHER P.
CAROPRESO M.F.
CAVNAR W.B.
CHAKRABARTI S.
CLACK C.
CLEVERDON C.
COHEN W. W.
COHEN W.W.
DAGAN I.
DEERWESTER S.
DENOYER L.
DIAZ ESTEBAN A.
DRUCKER H.
DUMAIS S.T.
ESCUDERO G.
Fabrizio Sebastiani
FIELD B.
FORSYTH R. S.
FUHR N.
FURNKRANZ J.
GALAVOTTI L.
GALE W. A.
GOVERT N.
GRAY W.A.
GUTHRIE L.
HAYES P.J.
HEAPS H.
HERSH W.
HULL D. A.
ITTNER D.J.
IWAYAMA M.
IYER R.D.
JOACHIMS T.
JOHN G. H.
JUNKER M.
KESSLER B.
KIM Y.-H.
KLINKENBERG R.
KNORZ G.
KOLLER D.
LAM S.L.
LAM W.
LANG K.
LARKEY L. S.
LARKEY L.S.
LEWIS D. D.
LEWIS D.D.
LI H.
LI Y.H.
LIERE R.
LIM J. H.
MASAND B.
MCCALLUM A. K.
MCCALLUM A.K.
MLADENIC D.
MOULINIER I.
MYERS K.
NG H.T.
OH H.-J.
PAZIENZA M. T.
RILOFF E.
ROBERTSON S.E.
ROTH D.
RUIZ M.E.
SABLE C.L.
SARACEVIC T.
SCHAPIRE R. E.
SCHUTZE H.
SCOTT S.
SEBASTIANI F.
SINGHAL A.
SLONIM N.
TAIRA H.
TUMER K.
TZERAS K.
VAN RIJSBERGEN C. J.
WIENER E.D.
YANG Y.
YU K.L.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2001

The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting of the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation. Comment: Accepted for publication in ACM Computing Surveys

arXiv.org e-Print Archive

CiteSeerX

Crossref

Computational Indistinguishability between Quantum States and Its Cryptographic Application

Author: A. Bogdanov
A. Kawachi
A.C.-C. Yao
Akinori Kawachi
C. Crépeau
C. Moore
C.H. Bennett
D. Aharonov
D. Bacon
D. Boneh
D. Mayers
D. Micciancio
D. Robinson
E.M. Luks
G. Kuperberg
G.M. Nikolopoulos
H. Kobayashi
H.-K. Lo
Harumichi Nishimura
I. Damgård
J. Grollmann
J. Kempe
J. Köbler
J. Watrous
M. Adcock
M. Ajtai
M. Bellare
M. Blum
M. Crâsmaru
M. Ettinger
M. Grigni
M. Hayashi
M. Tompa
M.A. Nielsen
O. Regev
P. Dumais
P.W. Shor
R. Impagliazzo
S. Goldwasser
S. Hallgren
T. Okamoto
Takeshi Koshiba
Tomoyuki Yamakami
U. Schöning
V. Arvind
W. Diffie
W. Höffding
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/03/2011

We introduce a computational problem of distinguishing between two specific quantum states as a new cryptographic problem to design a quantum cryptographic scheme that is "secure" against any polynomial-time quantum adversary. Our problem, QSCDff, is to distinguish between two types of random coset states with a hidden permutation over the symmetric group of finite degree. This naturally generalizes the commonly used distinction problem between two probability distributions in computational cryptography. As our major contribution, we show that QSCDff has three properties of cryptographic interest: (i) QSCDff has a trapdoor; (ii) the average-case hardness of QSCDff coincides with its worst-case hardness; and (iii) QSCDff is computationally at least as hard as the graph automorphism problem in the worst case. These cryptographic properties enable us to construct a quantum public-key cryptosystem, which is likely to withstand any chosen plaintext attack of a polynomial-time quantum adversary. We further discuss a generalization of QSCDff, called QSCDcyc, and introduce a multi-bit encryption scheme that relies on similar cryptographic properties of QSCDcyc. Comment: 24 pages, 2 figures. We improved the presentation, and added more detailed proofs and follow-up of recent work

arXiv.org e-Print Archive

Crossref

Modeling and Inferring Cleavage Patterns in Proliferating Epithelia

Author: A Classen
Ankit B. Patel
B Dubertret
BI Shraiman
C Bertet
D Kwiatkowska
DW Thompson
F-T Lewis
J Dumais
J Feldman
J Miller
J Resino
J Zallen
Jeffrey Axelrod
L Baena-Lopez
M Théry
Matthew C. Gibson
MG Gibson
R Cowan
R Farhadifar
Radhika Nagpal
RS Smith
S Bohn
William T. Gibson
Publication venue: Public Library of Science
Publication date: 01/01/2009

The regulation of cleavage plane orientation is one of the key mechanisms driving epithelial morphogenesis. Still, many aspects of the relationship between local cleavage patterns and tissue-level properties remain poorly understood. Here we develop a topological model that simulates the dynamics of a 2D proliferating epithelium from generation to generation, enabling the exploration of a wide variety of biologically plausible cleavage patterns. We investigate a spectrum of models that incorporate the spatial impact of neighboring cells and the temporal influence of parent cells on the choice of cleavage plane. Our findings show that cleavage patterns generate “signature” equilibrium distributions of polygonal cell shapes. These signatures enable the inference of local cleavage parameters such as neighbor impact, maternal influence, and division symmetry from global observations of the distribution of cell shape. Applying these insights to the proliferating epithelia of five diverse organisms, we find that strong division symmetry and moderate neighbor/maternal influence are required to reproduce the predominance of hexagonal cells and low variability in cell shape seen empirically. Furthermore, we present two distinct cleavage pattern models, one stochastic and one deterministic, that can reproduce the empirical distribution of cell shapes. Although the proliferating epithelia of the five diverse organisms show a highly conserved cell shape distribution, there are multiple plausible cleavage patterns that can generate this distribution, and experimental evidence suggests that plants and fruit flies indeed use distinct division mechanisms.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

Embellishing Text Search Queries to Protect User Privacy

Author: Adar E.
Baeza-Yates R.
Barbaro M.
Benaloh J. C.
Dumais S. T.
Husbands P.
Joho H.
Kushilevitz E.
Song D. X.
Text TREC.
Publication venue: 'VLDB Endowment'
Publication date: 01/01/2010

Users of text search engines are increasingly wary that their activities may disclose confidential information about their business or personal profiles. It would be desirable for a search engine to perform document retrieval for users while protecting their intent. In this paper, we identify the privacy risks arising from semantically related search terms within a query, and from recurring high-specificity query terms in a search session. To counter the risks, we propose a solution for a similarity text retrieval system to offer anonymity and plausible deniability for the query terms, and hence the user intent, without degrading the system’s precision-recall performance. The solution comprises a mechanism that embellishes each user query with decoy terms that exhibit a specificity spread similar to that of the genuine terms, but point to plausible alternative topics. We also provide an accompanying retrieval scheme that enables the search engine to compute the encrypted document relevance scores from only the genuine search terms, yet remain oblivious to their distinction from the decoys. Empirical evaluation results are presented to substantiate the effectiveness of our solution.

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University

The Effect of Mindfulness-based Programs on Cognitive Function in Adults: A Systematic Review and Meta-analysis

Author: Acabchuk R
Arenaza-Urquijo EM
Barnhofer T
Bottcher A
Britton W
Chetelat G
Cohen A
Coll-Padros N
Collette F
Dautricourt S
Demnitz-King H
Dumais T
Klimecki O
Lazar SW
Lee M
Lutz A
Marchant NL
Meiberth D
Moitra E
Moulinet I
Muller T
Parsons E
Sager L
Sannemann L
Scharf J
Schild A-K
Schlosser M
Touron E
Vago D
Walker Z
Whitfield T
Wirth M
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 04/08/2021

Mindfulness-based programs (MBPs) are increasingly utilized to improve mental health. Interest in the putative effects of MBPs on cognitive function is also growing. This is the first meta-analysis of objective cognitive outcomes across multiple domains from randomized MBP studies of adults. Seven databases were systematically searched to January 2020. Fifty-six unique studies (n = 2,931) were included, of which 45 (n = 2,238) were synthesized using robust variance estimation meta-analysis. Meta-regression and subgroup analyses evaluated moderators. Pooling data across cognitive domains, the summary effect size for all studies favored MBPs over comparators and was small in magnitude (g = 0.15; [0.05, 0.24]). Across subgroup analyses of individual cognitive domains/subdomains, MBPs outperformed comparators for executive function (g = 0.15; [0.02, 0.27]) and working memory outcomes (g = 0.23; [0.11, 0.36]) only. Subgroup analyses identified significant effects for studies of non-clinical samples, as well as for adults aged over 60. Across all studies, MBPs outperformed inactive, but not active comparators. Limitations include the primarily unclear within-study risk of bias (only a minority of studies were considered low risk), and that statistical constraints rendered some p-values unreliable. Together, results partially corroborate the hypothesized link between mindfulness practices and cognitive performance. This review was registered with PROSPERO [CRD42018100904].

UCL Discovery